feat(ingestion/snaplogic): Add snaplogic as a source for metadata ingestion #14231

Open: SalimAbdul-snaplogic wants to merge 1 commit into master

SalimAbdul-snaplogic commented Jul 25, 2025

Add a new source for metadata ingestion.
This PR adds basic functionality to harvest lineage from SnapLogic into DataHub.

Mainly we want to load column- and table-level lineage. Apart from that, we map each SnapLogic pipeline to a DataHub pipeline.
A SnapLogic pipeline is a large pipeline made up of many small operations, so we create one pipeline in DataHub that contains many small tasks. The mapping is described in the generated documentation.
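
For readers unfamiliar with the target model, here is a minimal sketch of the pipeline-to-tasks mapping described above. It is not the code in this PR: the `Snap` and `SnapLogicPipeline` classes, the `snaplogic` orchestrator name, and the sample IDs are assumptions made for illustration. Only the DataFlow and DataJob URN layouts follow DataHub's documented format.

```python
from dataclasses import dataclass, field


@dataclass
class Snap:
    """One small operation inside a SnapLogic pipeline (hypothetical shape)."""
    snap_id: str
    name: str


@dataclass
class SnapLogicPipeline:
    """A SnapLogic pipeline and its operations (hypothetical shape)."""
    pipeline_id: str
    name: str
    snaps: list[Snap] = field(default_factory=list)


def data_flow_urn(pipeline: SnapLogicPipeline, env: str = "PROD") -> str:
    # A DataHub pipeline is a DataFlow: (orchestrator, flow id, cluster).
    # "snaplogic" as the orchestrator name is an assumption.
    return f"urn:li:dataFlow:(snaplogic,{pipeline.pipeline_id},{env})"


def data_job_urns(pipeline: SnapLogicPipeline, env: str = "PROD") -> list[str]:
    # Each small operation becomes a DataJob (task) nested under the flow.
    flow = data_flow_urn(pipeline, env)
    return [f"urn:li:dataJob:({flow},{snap.snap_id})" for snap in pipeline.snaps]


# Made-up sample pipeline with two operations.
pipeline = SnapLogicPipeline(
    pipeline_id="orders_etl",
    name="Orders ETL",
    snaps=[Snap("read_orders", "Read orders"), Snap("write_orders", "Write orders")],
)

print(data_flow_urn(pipeline))
for urn in data_job_urns(pipeline):
    print(urn)
```

One flow URN with one job URN per operation keeps the whole SnapLogic pipeline visible as a single DataHub pipeline while still exposing each step as a task.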

What we're not mapping: the SnapLogic task and plex asset types.

We currently see no performance issues, but we will test this implementation on heavily loaded environments later.

github-actions bot added the ingestion, product, and community-contribution labels Jul 25, 2025
datahub-cyborg bot added the needs-review label Jul 25, 2025

codecov bot commented Jul 25, 2025

Bundle Report

Changes will increase total bundle size by 6.08kB (0.02%) ⬆️. This is within the configured threshold ✅

Detailed changes
Bundle name              Size      Change
datahub-react-web-esm    28.36MB   6.08kB (0.02%) ⬆️

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

Asset Name                       Size Change   Total Size   Change (%)
assets/index-*.js                1.13kB        18.71MB      0.01%
assets/snaplogic-*.png (New)     4.95kB        4.95kB       100.0% 🚀

Files in assets/index-*.js:

  • ./src/app/ingestV2/source/builder/sources.json → Total Size: 33.58kB

  • ./src/app/ingest/source/builder/constants.ts → Total Size: 6.61kB

  • ./src/images/snaplogic.png → Total Size: 45 bytes

  • ./src/app/ingest/source/builder/sources.json → Total Size: 34.42kB

@shirshanka
Contributor

Thanks for the contribution, @SalimAbdul-snaplogic.
Could you provide in the PR description a summary of:

  1. What assets are you mapping from SnapLogic to DataHub?
  2. What asset types are you not mapping?
  3. Any issues or design challenges you faced in mapping SnapLogic concepts to DataHub concepts?
  4. Any performance issues that might exist in the current implementation?

Thanks!

datahub-cyborg bot added the pending-submitter-response label and removed the needs-review label Jul 29, 2025

codecov bot commented Jul 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.


@SalimAbdul-snaplogic
Author

  1. Mainly we want to load column- and table-level lineage. Apart from that, we map each SnapLogic pipeline to a DataHub pipeline.
    A SnapLogic pipeline is a large pipeline made up of many small operations, so we create one pipeline in DataHub that contains many small tasks. The mapping is described in the documentation.
  2. We're not mapping the SnapLogic task and plex asset types.

  3. On performance: we currently see no issues, but we will test this implementation on heavily loaded environments later.
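
As an illustration of what table- and column-level lineage looks like on the DataHub side, here is a small sketch. It is an assumption-laden example rather than this PR's implementation: the helper names, the `postgres` and `snowflake` platforms, and the table and column names are all made up. Only the dataset and schemaField URN layouts follow DataHub's documented format.

```python
def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    # Dataset URN layout: (data platform URN, dataset name, environment).
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def schema_field_urn(dataset: str, column: str) -> str:
    # Column-level lineage refers to columns through schemaField URNs.
    return f"urn:li:schemaField:({dataset},{column})"


def column_lineage(upstream: str, downstream: str, column_map: dict[str, str]) -> list[tuple[str, str]]:
    """Return (upstream field URN, downstream field URN) pairs."""
    return [
        (schema_field_urn(upstream, src), schema_field_urn(downstream, dst))
        for src, dst in column_map.items()
    ]


# Hypothetical source and target tables touched by one pipeline.
source = dataset_urn("postgres", "shop.public.orders")
target = dataset_urn("snowflake", "analytics.public.orders")

# Table-level lineage is a single upstream -> downstream edge.
table_edge = (source, target)

# Column-level lineage adds one edge per mapped column.
edges = column_lineage(source, target, {"id": "order_id", "total": "amount"})
for up, down in edges:
    print(up, "->", down)
```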

datahub-cyborg bot added the needs-review label and removed the pending-submitter-response label Aug 1, 2025
@yoonhyejin
Collaborator

yoonhyejin commented Aug 4, 2025

Hello, thanks for raising the PR! It looks like lint is failing. Could you run this and push the changes? It should auto-fix most issues, but a few may need to be fixed manually.

./gradlew :datahub-web-react:yarnLintFix

You can validate lint locally with this command:

./gradlew :datahub-web-react:yarnLint

@treff7es, could you review this PR? Thanks!

yoonhyejin requested a review from treff7es August 4, 2025 06:53
datahub-cyborg bot added the pending-submitter-response and needs-review labels and removed the needs-review and pending-submitter-response labels Aug 4, 2025
datahub-cyborg bot added the pending-submitter-response label and removed the needs-review label Aug 5, 2025
@SalimAbdul-snaplogic
Author

SalimAbdul-snaplogic commented Aug 8, 2025

Thank you.
Those commands fixed the lint issues.

datahub-cyborg bot added the needs-review label and removed the pending-submitter-response label Aug 8, 2025
@sonivishal

Hi @treff7es, we need your help with the review here. Please let us know if you have any questions.

Labels
community-contribution: PR or Issue raised by member(s) of DataHub Community
ingestion: PR or Issue related to the ingestion of metadata
needs-review: Label for PRs that need review from a maintainer.
product: PR or Issue related to the DataHub UI/UX
4 participants